• Hugging Face’s popular and powerful inference server was briefly relicensed under a non-commercial license in an attempt to prevent larger businesses from hosting a competing version. The change didn't improve business outcomes but did diminish community involvement, and the project has since reverted to a more permissive license.

    Tuesday, April 9, 2024
  • CodeGemma, released by Google in collaboration with Hugging Face, is a family of open-access, code-specialized LLMs available in 2B and 7B sizes. The models were further trained on large amounts of code-heavy data to improve code completion, code generation, and logical reasoning; a minimal usage sketch follows.
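
    A minimal sketch of loading one of the checkpoints with transformers and completing a code prompt. The hub id and gating steps are assumptions based on the release announcement, so check the model card for the exact identifiers and prompt format.

    ```python
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumed hub id for the 2B base checkpoint; the weights are gated, so you may
    # need to accept the license on the Hub and run `huggingface-cli login` first.
    model_id = "google/codegemma-2b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Plain code-completion prompt for the base (non-instruct) variant.
    prompt = "def fibonacci(n):\n    "
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    ```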

  • Hugging Face is committing $10 million in free shared GPUs to help developers, academics, and startups create new AI technologies, aiming to counteract the centralization of AI advancements dominated by tech giants.

  • TRL is a Hugging Face library for training transformers with reinforcement learning and preference optimization. This example shows how to apply the same training process to vision-language models such as LLaVA; a sketch of the general trainer pattern follows.
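
    As a rough illustration of the trainer pattern TRL uses (not the exact LLaVA example), here is a hedged DPO sketch for a small text model. The model and dataset ids are illustrative, keyword names have shifted between TRL releases, and adapting it to a VLM additionally means loading the model's processor and an image-preference dataset.

    ```python
    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import DPOConfig, DPOTrainer

    # Illustrative small model and preference dataset; swap in a VLM checkpoint,
    # its processor, and an image-preference dataset for the LLaVA-style setup.
    model_id = "Qwen/Qwen2-0.5B-Instruct"
    model = AutoModelForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

    trainer = DPOTrainer(
        model=model,
        args=DPOConfig(output_dir="dpo-output", per_device_train_batch_size=1),
        train_dataset=dataset,
        processing_class=tokenizer,  # older TRL releases call this `tokenizer=`
    )
    trainer.train()
    ```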

  • The Hugging Face team has released a powerful math-specialized model.

  • This article looks at how Hugging Face approaches developer relations. Hugging Face is well known as a startup whose large online community also translates into massive multi-thousand-person meetups around the world. The community-centric company prioritizes giving the spotlight to community members and collaborators wherever possible, and provides compute and no-strings-attached cash grants to individuals and communities. It also helps maintain open-source libraries and collaborates closely with other tools, aiming to always take a collaborative approach when working with other groups, open-source platforms, and libraries.

  • Hugging Face serves and stores a lot of data, most of it in Git LFS. XetHub has written its own powerful alternative for scaling Git repositories.

  • Powered by phi-3-mini, this Space uses a "rarity" prompt to generate data about any topic. It isn't the most accurate, but it is fascinating and powerful; an illustrative sketch of the underlying generation pattern follows.
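
    As a rough illustration of the pattern (not the Space's actual prompt or code), a hedged sketch of prompting phi-3-mini to produce facts about a topic; the topic and prompt wording below are made up for the example.

    ```python
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Public Phi-3 mini instruct checkpoint; older transformers versions may
    # require trust_remote_code=True here.
    model_id = "microsoft/Phi-3-mini-4k-instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    topic = "deep-sea bioluminescence"  # illustrative topic
    messages = [
        {"role": "user",
         "content": f"List five obscure but true facts about {topic}, one per line."},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
    ```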

  • Heralax/Mistrilitary-7b is a specialized model hosted on Hugging Face, designed for text generation with a focus on factual question answering. It is built on the publicly available US Army field manuals, making it particularly relevant for users seeking information on military protocols and procedures. The model's name reflects its military focus, though the creator humorously notes the challenges of naming in the tech field. Its narrow focus on question-answering tasks may limit its flexibility with more open-ended queries, and the creator recommends a low temperature setting for optimal performance, indicating the model is best suited to straightforward, factual responses rather than creative or expansive dialogue. The training data was generated with a smaller model, Mistral NeMo, which helped maintain quality while reducing costs.

    Training used specific hyperparameters: a learning rate of 2e-05, a train batch size of 2, and a total of 6 epochs, on a multi-GPU setup spanning five devices. The optimizer was Adam with a cosine learning-rate scheduler. In terms of technical specifications, Mistrilitary-7b has 7.24 billion parameters, is based on the Llama architecture, and supports both 8-bit and 16-bit quantization formats, making it versatile across deployment scenarios. The model has not yet gained enough traction to be deployed via the Inference API, suggesting it is still in the early stages of community engagement.

    Overall, Mistrilitary-7b stands out as a domain-specific model tailored for military-related inquiries, leveraging extensive training on relevant texts to provide accurate and reliable answers, and reflecting a commitment to specialized AI tools for particular informational needs; a minimal loading-and-inference sketch follows.
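
    A minimal sketch of loading the model in 8-bit and sampling with the low temperature the creator recommends. The plain-text prompt format is an assumption, so check the model card for the exact template the model was trained on.

    ```python
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "Heralax/Mistrilitary-7b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # needs bitsandbytes
        device_map="auto",
    )

    # Illustrative factual question; the model card may specify a chat template.
    prompt = "What are the main responsibilities of a platoon leader?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.1)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    ```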

  • The article discusses a set of tiny test models trained on the ImageNet-1k dataset, created by Ross Wightman and published on Hugging Face. The models cover several popular architecture families and are designed for quick verification of model functionality, letting users download pretrained weights and run inference efficiently even on modest hardware. They are characterized by small size, lower default resolution, and reduced complexity, typically featuring only one block per stage and narrow widths, and were trained with a recent recipe adapted from MobileNet-v4 that is effective for maximizing accuracy in small models. While their top-1 accuracy scores are not especially impressive, they may be effective starting points for fine-tuning on smaller datasets and for applications with tight compute budgets, such as embedded systems or reinforcement learning tasks.

    The article summarizes the models' performance metrics, including top-1 and top-5 accuracy, parameter counts, and throughput at a resolution of 160x160 pixels; some models perform better at a slightly higher resolution of 192x192 pixels. It also reports throughput when compiled with PyTorch 2.4.1 on an RTX 4090 GPU, listing inference and training samples processed per second under different compilation modes, which highlights the models' speed for real-time applications.

    Finally, the article covers the architectural variations in the set: ByobNet combines elements from EfficientNet, ResNet, and DarkNet; the ConvNeXt variants use depth-wise convolutions and different activation functions; and the EfficientNet variants exercise several normalization techniques, including BatchNorm, GroupNorm, and LayerNorm. The author invites the community to explore applications for these tiny test models beyond mere testing, emphasizing their versatility and the innovative approaches taken in their design; a minimal inference sketch follows.
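
    A minimal sketch of pulling one of the tiny models with timm and running inference. The model name below is an assumption based on the collection's naming, so browse the collection on the Hub for the exact identifiers.

    ```python
    import timm
    import torch
    from PIL import Image

    # Assumed identifier for one of the tiny test models (check the Hub collection
    # for the real names). pretrained=True downloads the ImageNet-1k weights.
    model = timm.create_model("test_vit.r160_in1k", pretrained=True).eval()

    # Build preprocessing that matches the model's (low) default resolution.
    cfg = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**cfg, is_training=False)

    img = Image.new("RGB", (256, 256))   # stand-in for a real image
    x = transform(img).unsqueeze(0)

    with torch.no_grad():
        logits = model(x)
    top5 = logits.softmax(dim=-1).topk(5)
    print(top5.indices, top5.values)     # ImageNet-1k class ids and probabilities
    ```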